ABSTRACT

The purpose of this study was to adapt and illustrate the use of a
computer program to score binary patterns of response on a short-form
predictor test (Electronics Technician Selection Test and the
General Classification Test) so as to maximize the correlation
between this predictor and a criterion (the final school grade in the
Basic Electronics and Electricity School).

I.  INTRODUCTION

The  Navy  has  been  very  much  interested  in  recent  years  in  the 
possibility of using short-form tests to reduce testing time while
maintaining  or  even  increasing  test  reliability  and  validity. 

The advantages of a short-form test are manifold. With a short but
reliable and valid test the Navy could save thousands of dollars in
training costs by weeding out, before training even began, those
individuals who would probably not succeed. The test could be
administered at a training command, e.g., Naval Training Center, San
Diego, Bainbridge, etc., or even by a recruiter. For example, if an
individual desires to be a radioman and talks to a recruiter about
joining the Navy only if accepted for radioman training, it would be
advantageous for both the service and that individual if a brief test
of possibly five to seven minutes' duration could be administered,
graded and evaluated on the spot against the individual's desires for
such a Navy career. With this brief test both the Navy and the
potential recruit would know, in a relatively short period of time,
whether the man would be likely to succeed in radioman training.


II.  BACKGROUND 

Moonan (Ref. 1) pioneered this type of work for the Navy by
constructing a computer program capable of identifying combinations
of test items that have maximal validity. This program, entitled
SEQUIN (an acronym for Sequential Item Nominator), first selects the
item that has the highest validity with the criterion. The program
then selects another item which, when combined with the first,
produces a two-item test with a higher validity than any other
two-item test that includes the first item. This process continues
until the required number of items is selected and the maximum
validity for this number of items is obtained. The advantage of such
a program is that a fairly long test, such as the General
Classification Test (GCT), might be shortened without sacrificing
validity while test time might be significantly reduced.
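
The stepwise selection just described can be sketched as a greedy forward search. The Python sketch below is illustrative only and is not the SEQUIN program itself. It assumes that the validity of a set of items is the Pearson correlation between the number-correct score on those items and the criterion, which the text does not state explicitly. The function names pearson and greedy_select are hypothetical.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)

def greedy_select(responses, criterion, k):
    """Pick k item indices one at a time.

    responses: one list of 0/1 item scores per subject.
    criterion: one criterion score per subject.
    At each step the item is added whose inclusion gives the highest
    correlation between the running number-correct score and the criterion.
    """
    n_items = len(responses[0])
    chosen = []
    totals = [0] * len(responses)  # number correct on the items chosen so far
    for _ in range(k):
        best_item, best_r = None, float("-inf")
        for j in range(n_items):
            if j in chosen:
                continue
            trial = [t + row[j] for t, row in zip(totals, responses)]
            r = pearson(trial, criterion)
            if r > best_r:
                best_item, best_r = j, r
        chosen.append(best_item)
        totals = [t + row[best_item] for t, row in zip(totals, responses)]
    return chosen
```

Because each step keeps the items already chosen, the search examines far fewer combinations than an exhaustive search over all item subsets of the required size.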

SEQUIN has shown, repeatedly, that a short-form test is at
least as predictive of final school grade as its long-form counterpart
(Ref. 2). Swanson and Rimland (Ref. 3) have found that a short form
of the GCT, e.g., one-half to one-third of the original length, is even
more predictive of recruit final achievement (RFAT) than the complete
form.

III.  THE PROBLEM

This  study  attempts  to  increase  further  the  predictive  validity 
of  an  already  brief  test.    The  method,  developed  by  Dr.   R.  A. 
Weitzman  of  the  Naval  Postgraduate  School,  Monterey,  California, 
is  to  weight  item  responses  so  as  to  maximize  the  correlation  with  the 
criterion. 

On an n-item test where each question is graded as either
correct or incorrect, there are 2^n different possible patterns of
correct and incorrect responses. Thus, for example, on a five-item
test there are 32 (2^5) possible pattern scores as opposed to six
possible scores if just the number of correct responses were tallied.

For example, suppose a three-question test is given to a group
of recruits in an attempt to predict their success in a Navy training
school. There are eight (2^3) patterns running from 000 to 111
(where zeros are incorrect responses and ones are correct).
A subject having a pattern of 101 has the same number correct as
another subject with the pattern 110, that is, two out of three.
However, the first individual's binary pattern might be more
predictive of success in a particular training school than the second
subject's binary pattern.
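
The counting argument above can be checked with a few lines of Python. This is a minimal sketch for illustration; the function names all_patterns and group_by_total are hypothetical and do not come from the study's programs.

```python
from itertools import product

def all_patterns(n_items):
    """Return every pattern of n_items zeros and ones, from 00...0 to 11...1."""
    return ["".join(bits) for bits in product("01", repeat=n_items)]

def group_by_total(patterns):
    """Group patterns by their number of correct responses (count of ones)."""
    groups = {}
    for pattern in patterns:
        groups.setdefault(pattern.count("1"), []).append(pattern)
    return groups

patterns = all_patterns(3)
groups = group_by_total(patterns)
print(len(patterns), "patterns,", len(groups), "possible total-correct scores")
print("patterns with two correct:", groups[2])
```

Patterns 101 and 110 fall in the same total-correct group, so a number-correct score cannot distinguish them, whereas a pattern score can.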

This  study  will  focus  on  four  different  tests  or  test  scores, 
defined  as  follows: 

1.  Predictor - the predictor is a long-form test used for
predicting  success  in  a  Navy  training  school.     In  this  study,  the 
predictor  is  the  Electronics  Technician  Selection  Test  or  the  General 
Classification  Test.    Scores  on  the  predictor  are  determined  by 
counting  the  items  answered  correctly. 

2.  Criterion  -  the  final  school  grade  in  the  Basic  Electronics 
and  Electricity  School. 

3.  Total  Correct  -  the  total  number  of  correct  responses  out 
of  the  seven  questions  selected  by  SEQUIN  analysis  for  this  study. 

4.  Pattern  Score  -  a  special  score  assigned  to  each  pattern  of 
responses  on  the  same  seven  items  used  to  compute  total  correct. 
(A precise definition of pattern scores will be given in Section V.E.)

Thus,  the  purpose  of  this  study  was  to: 

1.  Gather  large  pools  of  data  from  a  Navy  training  school, 

2.  Extract several suitable questions from the General Classification
Test (GCT) and the Electronics Technician Selection Test (ETST),

3.  Write a computer program that:

a.  constructs all possible patterns of ones and zeros for
the number of extracted questions

b.  calculates pattern scores for each individual pattern

c.  assigns pattern scores to subjects

d.  correlates the pattern scores of the subjects with
their final school grades

e.  correlates the standard predictor test scores (either
GCT or ETST) with final school grades

f.  correlates total correct with final school grades

g.  correlates pattern score with total correct

h.  calculates a multiple correlation coefficient between
final school grades and a combination of pattern scores
and total correct

i.  calculates test statistics for the correlations

j.  calculates regression weights for predicting final
school grades from total-correct scores

k.  creates a frequency distribution showing the number of
subjects with each pattern score

l.  outputs all information in an easy-to-read form for
use in future studies

4.  Determine those patterns indicative of success for a particular
training school,

5.  Test pattern-score predictions by suitable cross-validation.

IV.  DATA


All data used in this research were obtained from Mr. Leonard
Swanson of the Naval Personnel and Training Research Laboratory,
San Diego, California, and were stored on nine-track magnetic tape
(Ref. 4). The data consisted of the individual records of
approximately 2400 trainees who started, but did not necessarily
finish, the Navy Basic Electronics and Electricity School in San
Diego. Each trainee's record consisted of the equivalent of six
computer-card records listing such information as:

1.  Responses  to  items  on  the  GCT,  ETST,  and  Arithmetic 
Aptitude  Test  (ARI) 

2.  Scores  on  the  GCT,  ARI,  and  ETST 

3.  Navy  service  number 

4.  Enlisted  rating 

5.  Final  school  grade  in  Basic  Electronics  and  Electricity 
School 

Tests used as predictors included the GCT and ETST. The GCT
consists of 60 verbal analogies and 40 sentence-completion items
with a 35-minute time limit. The ETST consists of three separately
timed sections: math, with 20 items and a 25-minute time limit;
science, with 20 items and a 15-minute time limit; and electricity and
radio, with 30 items and a 20-minute time limit (Ref. 5).

Two sets of questions were provided by Mr. Swanson along
with their answer keys. The first set of questions, consisting of
seven GCT items, and the second set, consisting of seven ETST
items, were selected using the SEQUIN program. The p-values,
question types and item validities are shown in Appendixes A and B.

The  criterion  consisted  of  final  school  grades  in  the  Basic 
Electronics  and  Electricity  School. 

A.        DATA  PREPARATION 

Two programs were written to extract all pertinent data for the
study and put them into usable form. (A glossary of terms used in all
programs is contained in Appendix L.) The first program checked
for completeness of an individual's record, i.e., the presence of six
computer-card images, and rejected those subjects whose files were
deficient. Unfortunately, several records contained special
characters, e.g., dashes, asterisks, etc., instead of integers.
Therefore, the first data preparation program converted any of these
special characters to zeros. Thus, a response other than an integer
from one to five was changed to a zero and counted as an incorrect
response. If a needed score such as the GCT, ETST or final school
grade was blank or contained some non-numerical mark, the record for
that individual was rejected as being incomplete. (It is possible that
some of those incomplete records resulted from subjects not finishing
the school, i.e., being required to leave the service because of
physical, emotional, or academic problems.) The output from this
program was written on tape or data cell and on paper.

Appendix  C  is  the  flowchart  of  this  first  program.    Appendix  D 
is  the  program  listing. 

Although a subject's record consisted of six computer-card
records, most of the information was superfluous. Of the six cards,
data from three, at most, were considered. Using the answer key
supplied by Mr. Swanson, the second program graded, in separate runs,
the ETST or GCT questions under consideration. Specifically, it
assigned a value of one to a correct response and a value of zero to
an incorrect response. By assigning ones and zeros to the responses,
the binary pattern was formed. The program also read the criterion
score and the predictor score.
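
The cleaning and grading steps can be illustrated with a short Python sketch. It is not a transcription of the programs in Appendixes D and F: the record layout is simplified to a string of response characters, the answer key in the usage example is made up, and the names clean_response and binary_pattern are hypothetical.

```python
VALID_RESPONSES = ("1", "2", "3", "4", "5")

def clean_response(ch):
    """Return the response as an integer 1-5, or 0 for anything else.

    Blanks, dashes, asterisks and other special characters become 0,
    which can never match an answer key entry and so counts as incorrect.
    """
    return int(ch) if ch in VALID_RESPONSES else 0

def binary_pattern(responses, answer_key):
    """Grade the selected items: '1' for a correct response, '0' otherwise."""
    cleaned = [clean_response(ch) for ch in responses]
    return "".join("1" if r == k else "0" for r, k in zip(cleaned, answer_key))

# Hypothetical seven-item answer key and one subject's raw responses.
key = [3, 2, 1, 4, 2, 5, 1]
print(binary_pattern("3-1*254", key))
```

Treating every unreadable response as zero keeps the record in the sample, while a missing score on the predictor or criterion removes it, as described above.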

The  output  from  this  second  program  consisted  of: 

1.  binary  pattern 

2.  criterion  score 

3.  predictor  score 

4.  an in-house identification number

5.  the subject's service number

Appendix  E  is  the  flowchart  for  this  second  program.    Appendix  F  is 
the  program  listing. 

V.  THE MODEL

The main program is divided into several distinct sections:
reading of data, determination of a joint frequency distribution,
computation of pattern scores, assignment of pattern scores to
subjects, computation of correlation coefficients (r's), computation
of test statistics for r differences, construction of response
patterns, ordering of response patterns according to the scores
computed for them, calculation of a multiple correlation coefficient,
calculation of the correlation coefficient between pattern scores and
total correct, construction of a frequency distribution showing the
number of people with each pattern score, and output (printed and
punched).

A  complete  listing  of  the  program  is  presented  in  Appendix  G. 

A.  DOUBLE  PRECISION  REQUIREMENT 

Because  of  the  large  sample  sizes  and  the  relatively  large 
magnitude  of  several  parameters,  it  was  necessary  to  use  double 
precision  floating  point  numbers. 

B.  THE  DATA  CARD 

The data card initializes four variables that are frequently used
in counting loops (DO loops). The variables, N1, N2, N3, and N4,
represent, respectively, the number of people in the sample, the
number of elements in a pattern, the range of criterion scores, and
the number of possible binary combinations using N2 items (2^N2).

C.        READING  THE  DATA 

Data are read only in a prescribed format. For this program,
the individual's data record card is set up as shown in Table I.

There are two read statements. One read statement reads the data
that are to be used in the computations of the program. The other
read statement reads a dummy variable, "IDUM." By placing the read
statement involving IDUM before or after the main read statement
(involving binary pattern, criterion, predictor, etc.), control over
alternate selections of data can be attained. For example, if only
odd-numbered records were to be considered, the read statement
involving IDUM would follow the main read statement, so that each
even-numbered record is read into the dummy variable and discarded.
Note that all input data are in FORTRAN "I format" (integer format).
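
The effect of the dummy read can be sketched in Python. This is an illustration of the idea only, not the FORTRAN read statements themselves; the function name read_alternate and its keep_odd parameter are hypothetical.

```python
def read_alternate(records, keep_odd=True):
    """Keep every other record, mimicking a main read paired with a dummy read.

    With keep_odd=True the 1st, 3rd, 5th, ... records are kept, as if the
    dummy read followed the main read. With keep_odd=False the 2nd, 4th, ...
    records are kept, as if the dummy read came first.
    """
    kept = []
    for position, record in enumerate(records, start=1):
        is_odd = position % 2 == 1
        if is_odd == keep_odd:
            kept.append(record)
        # Otherwise the record is consumed and ignored, like a read into IDUM.
    return kept
```

Reading the same file twice, once with each setting, yields two non-overlapping halves of the sample.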

The variable "J" is used as the DO loop counter for loops over
personnel, with only one exception: the determination of the joint
frequency distribution. An "I" DO loop is used for all other
counting operations.

At this point, the total correct out of the extracted questions
is calculated. The "E" array stores this information. This array is
used in the calculation of the sum of total correct for all subjects and
the sum of the squares of total correct for all subjects. This
information is later used in the computation of correlation
coefficients and of the mean and standard deviation for use in the
calculation of a test-retest correlation coefficient.

TABLE I

RECORD DATA CARD SETUP FOR
VALIDATION AND CROSS-VALIDATION PROGRAMS

Column Number    Item                         Program Symbol

1-7              Subject's binary pattern     P(I,J)
8,9              Criterion score              C(J)
10,11            Predictor (score on ETST)    A(J)
12-15            In-house ident. number       D(J)

D.  THE  JOINT  FREQUENCY  DISTRIBUTION 

A joint frequency distribution is constructed using decimal
equivalents of the 2^N2 binary patterns and the range of criterion
scores. The rows of the matrix (denoted by matrix variable F)
represent the decimal equivalents of the binary patterns, and in this
case there are 128 (2^7) binary patterns (the reason for using seven
questions is explained in the METHOD section). However, the lowest
binary pattern (0000000) has the decimal value zero. Therefore, a
value of one is added to all decimal equivalents. In this way the
first row is row one, not zero, and the last row is row 128.

The column numbers correspond to successive criterion scores.
Column one of the matrix corresponds to the subjects' lowest
criterion score. In this case, the lowest criterion score was 30 and
the highest was 99. The matrix is represented in Figure 1.

The  "B"  array  is  used  to  store  the  decimal  equivalent  of  an 
individual's  binary  pattern. 
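
The row and column indexing can be sketched as follows. The Python sketch assumes that patterns arrive as strings of zeros and ones and that B holds the row index (decimal equivalent plus one); the program listing in Appendix G may store these differently. The function name build_joint_frequency is hypothetical.

```python
def build_joint_frequency(patterns, criteria, n_items, lowest, highest):
    """Tally subjects by binary pattern (rows) and criterion score (columns).

    Row number    = decimal equivalent of the pattern + 1
    Column number = criterion score + 1 - lowest criterion score
    """
    n_rows = 2 ** n_items
    n_cols = highest - lowest + 1
    F = [[0] * n_cols for _ in range(n_rows)]
    B = []  # row number of each subject's pattern
    for pattern, score in zip(patterns, criteria):
        row = int(pattern, 2) + 1
        col = score + 1 - lowest
        B.append(row)
        F[row - 1][col - 1] += 1  # Python lists are zero-based
    return F, B
```

The one-based row and column numbers are kept in the sketch so that they can be compared directly with the description above; only the final list access is shifted to zero-based indexing.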

E.  COMPUTATION  OF  PATTERN  SCORES 

The pattern score for a pattern is the average criterion score of
the subjects who have that pattern. It is calculated from the F
matrix: for each criterion score, the number of subjects having both
the pattern and that criterion score is multiplied by the criterion
score. These products are summed, and the sum, S1, is divided by the
total number of subjects having the pattern, S2.

Figure 1

CONSTRUCTION OF A MATRIX DESCRIBING THE
JOINT FREQUENCY DISTRIBUTION

[Matrix F. Rows*: one per binary pattern. Columns**: numbered from
1 (criterion score 30) to 69 (criterion score 99).]

*Row numbers are decimal equivalents of binary patterns plus one.
**Column numbers are criterion scores plus one minus the lowest
criterion score. Numbers in parentheses are actual criterion scores.

If any of the 128 patterns is not used, because no one has the
pattern, both S1 and S2 are set equal to zero, and an arbitrary
score of -1 is assigned to the pattern.

Immediately  following  the  computation  of  all  pattern  scores, 
the  scores  are  outputted  on  punched  cards.    The  pattern  scores 
obtained  in  this  study  are  presented  in  Appendix  H. 
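
The pattern-score computation amounts to a weighted mean over each row of the joint frequency matrix. The Python sketch below assumes F is a list of rows of counts and that the first column corresponds to the lowest criterion score; the function name pattern_scores is hypothetical, while S1 and S2 follow the text.

```python
def pattern_scores(F, lowest):
    """Return one score per row of F: the mean criterion score, or -1 if unused."""
    scores = []
    for row in F:
        # S1: sum over columns of (count * criterion score for that column)
        s1 = sum(count * (lowest + j) for j, count in enumerate(row))
        # S2: number of subjects having this pattern
        s2 = sum(row)
        scores.append(s1 / s2 if s2 > 0 else -1.0)
    return scores

# Two patterns, three criterion scores (30, 31, 32); the second pattern is unused.
print(pattern_scores([[2, 0, 1], [0, 0, 0]], lowest=30))
```

Since real criterion scores are never negative, the value -1 can serve safely as a flag for an unused pattern.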

F.  ASSIGNMENT  OF  PATTERN  SCORE  TO  SUBJECTS 

The decimal equivalent of a subject's binary pattern is determined,
and he is assigned the pattern score for that decimal
equivalent (the row index corresponding to the pattern in the F
matrix).

G.  COMPUTATION  OF  CORRELATIONS 

Correlation  coefficients  are  then  calculated  between  the 
criterion  and  the  predictor  (GCT  or  ETST)  and  between  the  criterion 
and  the  assigned  pattern  scores. 

The sums of criterion scores (C1), pattern scores (X1), and
predictor scores (A1) are determined along with the corresponding
sums of squares (C2, X2, A2). The sums of the products of the
criterion and predictor scores (V) and of the criterion and pattern
scores (W) are also determined. The correlation coefficients for
pattern vs. criterion (R2) and predictor vs. criterion (R1) are then
calculated. The Z test statistic for the difference between these r's
is also calculated.
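
A correlation coefficient can be computed directly from such running sums by the standard computational formula for Pearson's r. The Python sketch below shows that formula; it is not a transcription of the Appendix G listing, and the function names are hypothetical. The Z statistic is omitted because the text does not give its formula.

```python
from math import sqrt

def correlation_from_sums(n, sum_x, sum_y, sum_xx, sum_yy, sum_xy):
    """Pearson r from raw sums, sums of squares and the sum of cross-products."""
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_xx - sum_x ** 2) * (n * sum_yy - sum_y ** 2))
    return numerator / denominator

def correlate(x, y):
    """Accumulate the sums for two score lists and return their correlation."""
    return correlation_from_sums(
        len(x),
        sum(x),
        sum(y),
        sum(a * a for a in x),
        sum(b * b for b in y),
        sum(a * b for a, b in zip(x, y)),
    )
```

With about 2,000 subjects and two-digit scores, the sums of squares and cross-products become large and the numerator is a difference of two nearly equal quantities, which is consistent with the double precision requirement stated in Section A. Python floats are double precision by default.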

Three  other  correlation  coefficients  are  computed  later  in  the 
program:    a  multiple  correlation  coefficient  (see  I  below),  the 
correlation  coefficient  between  total-correct  and  pattern  scores,  and 
a  test-retest  correlation  coefficient  used  as  an  estimate  of  the 
reliability  of  total  correct  scores  on  the  predictor. 

H.    CONSTRUCTION  OF  RESPONSE  PATTERNS 

Since there are 128 (2^7) different patterns of responses ranging
from  0000000  to  1111111,  the  computer  is  assigned  the  otherwise 
tedious  and  difficult  job  of  constructing  and  outputting  these  patterns. 
A  difficulty  encountered  is  that  leading  zeros  of  various  binary 
patterns,  although  stored  without  incident  in  the  machine,  are  lost 
upon  printing.    Because  of  this,  all  zeros  in  the  binary  patterns  are 
converted  to  twos.    This  fact  is  noted  on  the  printed  output  (Appendix 
H). 
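
The leading-zero difficulty and the zeros-to-twos workaround can be illustrated in Python. The sketch is only an analogy to the FORTRAN integer output described above; the function names printable_pattern and all_printable_patterns are hypothetical.

```python
def printable_pattern(index, n_items):
    """Return pattern number `index` (0-based) with zeros written as twos."""
    bits = format(index, "0{}b".format(n_items))  # e.g. 5 -> '0000101'
    return bits.replace("0", "2")

def all_printable_patterns(n_items):
    """Construct every pattern for n_items questions in ascending order."""
    return [printable_pattern(i, n_items) for i in range(2 ** n_items)]

# A pattern held as an integer loses its leading zeros when printed:
print(int("0000101"))
# Writing zeros as twos keeps all seven positions visible:
print(printable_pattern(5, 7))
```

Any digit other than one would serve as the stand-in for zero; two is simply the choice reported in the text.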

I.         MULTIPLE  CORRELATION  COEFFICIENT 

The multiple correlation coefficient indicates the strength of
relationship between one variable and the linear combination of two
or more others that produces the strongest relationship. Since
different predictor variables are sometimes intercorrelated and so
duplicate one another, the multiple correlation coefficient depends
on the intercorrelation of different predictor variables as well as on
the correlation of each with the criterion variable (Ref. 6).

Specifically,   the  multiple  correlation  between  criterion  scores 
and  a  combination  of  pattern  scores  and  total-correct  scores  is 
computed. 

Since the coefficient of multiple correlation considers the
interrelationship between the predictor variables, it should have,
theoretically, a value at least as great as the correlation between
either predictor and final school grades alone.

The  significance  of  the  multiple  r  is  next  computed  using  an  F 
statistic  where  F  is  the  ratio  of  the  variance  of  the  residuals  on  the 
criterion  before  considering  the  multiple  correlation  coefficient  and 
the  variance  of  the  residuals  after  consideration. 
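
For two predictors, the multiple correlation can be obtained from the three pairwise correlations by the standard textbook formula. The Python sketch below uses that formula for illustration; whether the Appendix G listing computes it the same way is an assumption. The function name multiple_r is hypothetical, and the numbers in the usage lines are made up rather than taken from the study.

```python
from math import sqrt

def multiple_r(r_y1, r_y2, r_12):
    """Multiple correlation of a criterion y with two predictors.

    r_y1, r_y2: correlation of each predictor with the criterion.
    r_12:       correlation between the two predictors.
    """
    if abs(r_12) >= 1.0:
        raise ValueError("predictors are perfectly correlated")
    r_squared = (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)
    return sqrt(r_squared)

# Made-up correlations, for illustration only.
print(multiple_r(0.6, 0.5, 0.4))
print(multiple_r(0.5, 0.5, 0.0))
```

When the two predictors are highly intercorrelated, the second adds little and the multiple correlation barely exceeds the larger of the two simple correlations.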

J.         COMPUTATION  OF  REGRESSION  WEIGHTS 

Since there is a possibility that some binary patterns will not
be used (i.e., there may be some binary patterns no one has because
the sample size is small in relation to the number of binary
combinations), it is conceivable that an individual in the
cross-validation group might have a pattern that no one in the
validation group has. Accordingly, regression weights are computed
from the relation between total-correct and criterion scores in the
validation group. These weights are used as input in the
cross-validation study to determine scores for individuals whose
patterns have a pattern score of -1.

K.  OUTPUTS I AND II

The  pattern  responses  are  then  sorted  according  to  pattern 
score from the lowest (-1) to the highest (73.66) and, in conjunction
with  the  pattern  score  and  total  correct  of  that  binary  pattern,  are 
printed  out  in  tabular  form.    The  table  and  results  thus  obtained  are 
shown  in  Appendix  H  as  Output  I. 

Next, tables are prepared listing the subject's in-house
identification, his predictor score, his final school grade (criterion
score), the pattern score associated with his binary pattern, and,
finally, the total correct scored out of the seven questions. A sample
showing the first 50 subjects is presented in Appendix I as Output II.

L.        ADDITIONAL  OUTPUT 

All correlations and test statistics computed during the
execution of the program are also printed. These results are
presented and discussed in the RESULTS section.

VI.  CROSS-VALIDATION

Cross-validation  is  a  method  used  to  estimate  the  magnitude 
of  sampling  variation.    In  cross-validation,  results  are  obtained 
from  a  second  sample  of  people  for  comparison  with  the  results  of 
an initial sample. If the results obtained from the second sample
confirm the results of the first, the results are said to hold up under
cross-validation.

In addition to the validation or main program, described in the
preceding section, this study makes use of a cross-validation program,
which is essentially a portion of the main program. It differs in that
pattern scores and regression weights derived from the previous
program are read in with new subjects' personal data and that patterns
are not constructed, pattern scores are not calculated, and there is
no need for a joint frequency distribution. The program listing for
the cross-validation is presented in Appendix J.

As can be seen from Appendix H, there are fourteen binary
patterns that were not used by the validation group in the ETST study.
Therefore, the cross-validation program has to determine whether a
subject has a pattern that was not used in the validation program and,
if he has, it must assign him a score using the regression weights
determined from the validation group and his total-correct score.

A.  ASSIGNMENT OF PATTERN SCORES

Various  other  methods  of  assigning  pattern  scores  to  patterns 
that  no  individual  in  the  validation  group  has  were  attempted.    These 
methods  included:    using  the  average  pattern  score  derived  from  the 
main  program,  weighting  more  heavily  those  patterns  appearing 
more  frequently  than  those  appearing  less  frequently,  ignoring  a 
subject  in  the  cross-validation  who  had  a  pattern  no  one  had  in  the 
validation group (with adjustment of corresponding variables, e.g.,
sample  size),  and  finally  using  the  regression  weights. 

With only one exception, that of using the regression weights,
all methods of attack failed. Pattern-score validities were
significantly lowered in all the other cases. (The reason for the
abrupt drop in pattern-score correlation coefficients in the
cross-validation is discussed in the RESULTS section.)

Using the regression weights, however, pattern-score validities
were maintained at their maximum. Scores were obtained by adding the
product of total correct and the slope regression weight to the
regressed mean.
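
The fallback rule can be sketched as a simple linear regression of criterion score on total correct. The Python sketch below assumes that the "regressed mean" is the constant term (intercept) of that regression, which the text does not state outright. The names fit_line and score_for are hypothetical, and the numbers in the usage lines are made up.

```python
def fit_line(total_correct, criterion):
    """Least-squares slope and intercept for criterion on total correct."""
    n = len(total_correct)
    mean_x = sum(total_correct) / n
    mean_y = sum(criterion) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(total_correct, criterion))
    sxx = sum((x - mean_x) ** 2 for x in total_correct)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def score_for(pattern, pattern_scores, slope, intercept):
    """Use the validation pattern score if one exists, else the regression estimate."""
    known = pattern_scores.get(pattern, -1)
    if known != -1:
        return known
    total = pattern.count("1")  # total correct for this pattern
    return intercept + slope * total

# Made-up validation data, for illustration only.
slope, intercept = fit_line([0, 1, 2, 3], [50, 55, 60, 65])
print(score_for("101", {"101": 61.0, "110": -1}, slope, intercept))
print(score_for("110", {"101": 61.0, "110": -1}, slope, intercept))
```

Under this rule every subject in the cross-validation group receives a score, so no subject has to be dropped from the sample.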

Inputs  for  the  cross-validation  consisted  of  the  same  information 
as  noted  in  the  main  program  plus  the  regression  weights  and  the 
pattern  scores  from  the  main  program. 

The tabular results for the first fifty subjects (even numbers
only) are presented in Appendix K as Output III.

VII.  METHOD

The validation and cross-validation programs were first used
on GCT data. Not only were the GCT data analyzed, but they also
served, at the beginning of the research effort, as a test platform for
debugging the validation and cross-validation computer programs.
The study concentrated on the ETST data, however.

A.       PRELIMINARY  STUDIES 

GCT data were used in preliminary studies. Use of GCT data
as a predictor, as originally planned, was unsatisfactory because
the GCT was not designed as a predictor of success in a training
school and because, of the seven questions considered in the study,
approximately one-third of the sample subjects had all correct, which
is hardly an indication of predictive validity.

The first step in the study was the determination of the sample
size to be used in the validation and cross-validation programs.
Since the total number of possible combinations of ones and zeros
was 128 (2^7), it was decided that an appropriate sample size in the
main program would be 2,000. This would result in the theoretical
utilization of 15-16 subjects per binary pattern:

    2,000 subjects / 128 patterns = 15.6 (approximately)

The remaining subjects in the sample (379) would then be used in
cross-validation studies.

Table II summarizes the results of this first effort. As can be
seen from these results, the greatest validity for the validation
(main) group was obtained from the predictor vs. criterion scores
(r = 0.51). The correlation between the pattern and criterion scores
was only 0.44. Although lower than the predictor validity
coefficient, it was still better than the r for total correct
(0.37). The high absolute values of the test statistics indicate that
all the differences were significant.

The cross-validation tells essentially the same story. The
validities for the predictor and total correct were very nearly the
same as in the validation program. However, the validity for the
patterns fell short of its counterpart in the validation program
(r_xval = .34 vs. r_val = .44). This phenomenon resulted from the
weighting of item responses which maximized the correlation with
the criterion, i.e., minimized the error of prediction, thereby
capitalizing on chance in the validation group. The fact that chance
played an important role in the validation program was further
illustrated by one subject who had a binary pattern with four ones,
i.e., four out of seven correct, but who also had the highest of all
pattern scores.

Another explanation for the substantial reduction in pattern
vs. criterion validities is that the mean pattern score was used for
individuals in the cross-validation group who had patterns no one had
in the validation group.

TABLE II

CORRELATION COEFFICIENTS AND TEST STATISTICS
DERIVED FROM THE GENERAL CLASSIFICATION TEST

                    Main      Cross-Validation

r (pattern)         0.44           0.34
r (predictor)       0.51           0.52
Z                  -2.72          -3.11
r (total ones)      0.37           0.36
Z                   2.74          -0.27

NOTE: 1.  The first Z is for the difference between the
pattern-criterion and the predictor-criterion correlations. The
second Z is for the difference between the pattern-criterion and
total correct-criterion correlations.

2.  The sample size in the main study was 2,000 subjects
while the sample size in the cross-validation study was 379 subjects.

B.        ASSIGNMENT  OF  PATTERN  SCORES 

Because of the discrepancies in pattern-score validities between
the validation and cross-validation programs, the problem arose of
assigning a valid pattern score to an individual who, in the
cross-validation process, had a pattern no one had in the main
validation. Therefore, the several approaches mentioned earlier were
formulated and attempted.

1.      The  First  Solution 

The  first  of  these  proposed  solutions  involved  the  use  of 
weights  proportional  to  the  number  of  subjects  having  a  pattern. 
The  weights  were  to  be  calculated,  along  with  the  pattern  scores, 
in  the  main  program  and  outputted  on  punched  cards.    The  theory 
behind  this  solution  was  that  if  a  binary  pattern  appeared  very 
frequently  it  should  have  been  counted  more  heavily  in  the  cross- 
validation  than  those  patterns  appearing  less  frequently.    Once  again, 
considering  the  subject  who  had  the  highest  pattern  score  with  only 
four  correct,  it  would  appear  logical  that  that  person  was  not  typical 
and should not have been counted equally with the others. That is, would
it  have  been  valid  to  give  his  score  the  same  weight  as  a  score 
that  was  25  per  cent  more  prevalent?    If  both  scores  receive  equal 
weight,  distortion  of  the  validities  must  certainly  occur.    Unfortunately, 
a  suitable  method  of  computing  and  applying  such  weights  was  not  found. 
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2.  The  Second  Solution 

The second solution was to reduce the number of questions
used in the study from seven to six. With only six questions the
number of possible binary patterns would be halved (from 128 to 64),
so that a larger proportion of the patterns would actually occur in
the validation sample. It was hoped, in fact, that every binary
pattern would occur. Thus, when going into the cross-validation phase,
all patterns would have had pattern scores and the need for generating
pattern-score substitutes in the cross-validation would have been
eliminated. However, even with only six questions (64 combinations of
ones and zeros), eight binary patterns were not used. Furthermore, the
magnitudes of the correlation coefficients decreased markedly.
Therefore, this approach was eliminated.

3.  The  Third  Solution 

The third solution called for the elimination of those
subjects in the cross-validation who had a binary pattern no one
had in the validation study. The theory behind this solution was, in
essence, to eliminate the problem by pretending it wasn't there.
This solution was unsuitable for an obvious reason: for a test to
be valid in a real environment, as opposed to a laboratory
environment, it must handle all contingencies.

4.  The  Final  Solution 

It was finally decided to calculate regression weights and
use these in assigning pattern scores to subjects in the
cross-validation who had patterns no one had in the main program.
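
A minimal sketch of this final solution follows. It assumes that the
regression weights are an intercept and a slope obtained by least
squares from the validation sample, with total correct as the
independent variable. The thesis states only that a regressed mean and
a weight were computed in the main program, so the choice of dependent
variable passed to the hypothetical `fit_line` helper is an assumption.
A negative pattern score is taken to mark a pattern no one had.

```python
def fit_line(x, y):
    """Least-squares intercept and slope of y on x (illustrative helper)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def assign_pattern_score(pattern_score, total_correct, intercept, slope):
    """Keep an observed pattern score; otherwise substitute the regression
    estimate based on the subject's total correct."""
    if pattern_score >= 0:
        return pattern_score
    return intercept + slope * total_correct


# Toy validation data: total correct versus criterion score.
intercept, slope = fit_line([0, 1, 2], [50, 55, 60])
print(assign_pattern_score(-1.0, 4, intercept, slope))   # unseen pattern
print(assign_pattern_score(62.5, 4, intercept, slope))   # observed pattern
```

The substitution mirrors the statement `X(J)=E(J)*BB+AA` in the
cross-validation listing, where `AA` and `BB` are read from cards
produced by the main program.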
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C.  READ  IN  OF  ALTERNATE  DATA 

The possibility of sample bias was also considered; e.g., the
predictor or criterion scores of the entire sample could have been
placed in order of increasing or decreasing magnitude. Therefore,
it was decided to split the sample in half, one half to be used
in the validation program and the other half in the cross-validation
program. The main or validation program was designed to read the
records of every alternate subject, e.g., every odd-numbered subject,
and make the appropriate calculations from those data. The
cross-validation program also read every alternate, but
complementary, record. Thus, for example, if the main program
read every odd record, the cross-validation program read every even
record. Unfortunately, splitting the sample this way resulted in a
drop in the number of subjects per binary pattern from fifteen to
approximately nine.
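
The alternate-record split can be sketched as follows. This is an
illustration only; the thesis programs achieve the same effect by
reading a dummy record between data records, and the function name is
hypothetical.

```python
def split_alternate(records):
    """Split records into odd-numbered and even-numbered subjects.

    The first list holds the 1st, 3rd, 5th, ... records (validation);
    the second holds the 2nd, 4th, 6th, ... records (cross-validation).
    """
    validation = records[0::2]
    cross_validation = records[1::2]
    return validation, cross_validation


validation, cross_validation = split_alternate(list(range(1, 8)))
print(validation)        # subjects 1, 3, 5, 7
print(cross_validation)  # subjects 2, 4, 6
```

Interleaving the two halves in this way guards against any ordering of
the file by predictor or criterion score, at the cost of halving the
number of subjects available per pattern.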

D.  THE  ETST  STUDY 

After  solving  the  problem  of  assigning  pattern  scores  to 
subjects  in  the  cross-validation  study,  the  research  focused  on 
utilization  of  the  ETST  as  the  predictor. 

Data  preparation  followed  the  same  procedures  as  those 
noted  in  the  data-preparation  section  of  this  thesis. 

In  addition  to  the  tables  and  correlation  coefficients  computed 
in  the  validation  and  cross-validation  processes,  the  programs  also 
outputted  the  sum  of  total  correct  and  the  sum  of  the  squares  of 
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total  correct  for  all  subjects.    This  was  used  in  the  computation  of 
the mean and variance for total correct (total ones). The reason
for these calculations was to determine the test-retest correlation
coefficient.

1.  Test-Retest Reliability Coefficient

The test-retest reliability coefficient, as described by
Weitzman (Ref. 7), is an estimate of the correlation between identical
versions of a test taken by the same persons in independent trials.
For a test with n items and a alternatives per item, this estimate is:

$$r_{tt} = 1 - \frac{n - M}{a S^{2}} \qquad (1)$$

where M and S are the mean and standard deviation, respectively, of
the total-correct scores.

This estimate of the test-retest reliability coefficient
can be used in the determination of the correction for attenuation.

2.  Correction  for  Attenuation 

Because correlation results are obtained from fallible
measurements, errors tend to reduce or attenuate the correlation
between traits. Using the formula for correction for attenuation, it
is possible to estimate what the correlation would be if perfect,
errorless measurements were available (Ref. 8). Correlation
coefficients that are corrected for attenuation cannot be used in
prediction equations but can be used when analyzing relationships to
make allowances for random errors of measurement.

Using the test-retest correlation coefficient computed for
the predictor, it is possible to calculate the validity of the predictor
corrected for attenuation. The value obtained from the following
formula is the theoretical correlation coefficient if the predictor
were error-free:

$$r_{\infty c} = \frac{r_{pc}}{\sqrt{r_{pp}}} \qquad (2)$$

The correlation coefficient $r_{pc}$ is the measured validity between
the predictor and criterion, and $r_{pp}$ is the test-retest reliability
described in the previous section. A comparison between the
validity coefficient ($r_{pc}$) and the validity coefficient corrected
for attenuation ($r_{\infty c}$) was used as an indication of how close
this study came to the theoretical limit of validity for the predictor.
Specifically, r(total correct), the correlation of total correct with
final school grade, was compared to the corresponding correlation
coefficient corrected for attenuation.
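
As a worked illustration, the correction can be applied to the values
reported for the seven-item ETST predictor (a validity of 0.73 and a
test-retest reliability of 0.73). The sketch assumes the
single-variable form of the correction, in which only the predictor's
unreliability is removed; the function name is illustrative.

```python
import math


def correct_for_attenuation(validity, reliability):
    """Validity the predictor would show if it were error-free."""
    return validity / math.sqrt(reliability)


corrected = correct_for_attenuation(0.73, 0.73)
print(round(corrected, 2))  # 0.85
```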
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VIII.     RESULTS 


A.  DETERMINATION  OF  LINEARITY 

A product-moment correlation coefficient is an appropriate
measure only if a linear relationship exists between the variables
being correlated. Figures 2 and 3 are scatter diagrams which were
used to determine whether a linear relationship existed between total
correct out of seven and final school grade, and between the full
ETST score and final school grade. Note that almost all the points
can be enclosed in an oval which runs from the lower left to the upper
right, thereby indicating linearity (Ref. 9).

B.  CORRELATION  COEFFICIENTS  AND  TEST  STATISTICS 

Table III lists the values for all test statistics and correlation
coefficients.

As can be seen from that table, the value of r(pattern score)
decreases from 0.76 in the validation program to 0.72 in the
cross-validation program, the reduction being due to maximization of
chance in the main program. The multiple correlation coefficient
was computed to see whether pattern scores add to the predictive
ability of total-correct scores. The multiple correlation coefficient
did not increase the value of r(total correct), thus indicating no
additional predictive ability. The
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FIGURE    2 
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FIGURE    3 
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TABLE  HI 


CORRELATION COEFFICIENTS AND TEST STATISTICS DERIVED FROM
THE ELECTRONICS TECHNICIAN SELECTION TEST

                                      Main    Cross-Validation

r(pattern score)                      0.76         0.72
r(predictor)                          0.61         0.60
Z                                     7.13         4.93

r(total correct)                      0.72         0.73
Z                                     2.26        -0.54

r(pattern score/total correct)        0.95         0.95

r(multiple)                           0.76         0.73
F                                   176.36        15.38

r(test-retest)                             0.73

Correction for Attenuation                 0.85

NOTE: 1. The first Z is for the difference between the pattern-
criterion and the predictor-criterion correlations. The second Z
is for the difference between the pattern-criterion and total-correct-
criterion correlations.

2. The sample size in each of the main and cross-validation
studies was 1,182 subjects.
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large  value  of  F  indicates  that  the  total- correct  scores  contributed 
significantly  to  the  predictive  ability  of  the  pattern  scores, 
however. 
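
The test statistics in Table III can be sketched as follows. The
formulas are those that appear in the validation listing (Appendix G):
a Fisher z transformation for the difference between two correlations,
a two-predictor multiple correlation, and an F statistic for the
increment. The function names are illustrative. Because the inputs
below are the rounded table entries, the results only approximate the
published values.

```python
import math


def z_difference(r_a, r_b, n):
    """Z for the difference between two correlations (Fisher transform),
    using the standard error sqrt(2 / (n - 3)) from the listing."""
    return (math.atanh(r_a) - math.atanh(r_b)) / math.sqrt(2.0 / (n - 3))


def multiple_r(r_pattern, r_total, r_between):
    """Multiple correlation of the criterion with pattern score and
    total correct, given their intercorrelation r_between."""
    r_squared = (r_pattern ** 2 + r_total ** 2
                 - 2.0 * r_pattern * r_total * r_between) / (1.0 - r_between ** 2)
    return math.sqrt(r_squared)


def f_increment(r_mult, r_base, n):
    """F for the gain of the multiple correlation over a single predictor."""
    return (r_mult ** 2 - r_base ** 2) * (n - 3) / (1.0 - r_mult ** 2)


n = 1182
print(round(z_difference(0.76, 0.61, n), 2))   # close to the tabled 7.13
r_mult = multiple_r(0.76, 0.72, 0.95)
print(round(r_mult, 2))                        # 0.76
print(f_increment(r_mult, 0.72, n) > 0)
```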

The high value of the correlation coefficient between pattern
scores and total-correct scores indicated that the seven items used
in the study constituted a very valid test and that the total correct
could be used as a predictor that is as good as the pattern scores
for these items.

The correction for attenuation revealed that the highest validity
theoretically obtainable by improving the reliability of the seven-
item predictor was 0.85. The value actually obtained, 0.73, was
equal to the test-retest reliability of the test. Since it is not
reasonable to expect that a test will correlate more highly with
another test than it does with itself, it is no wonder that the
pattern scores did not correlate better with the criterion than the
total-correct scores did.
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IX.    CONCLUSION 

The two FORTRAN computer programs developed in this
study successfully determined pattern scores and correlated them with
the criterion. However, the questions extracted from the ETST
were so highly valid that they could have been used alone, i.e.,
without pattern scoring, as predictors of success in the Basic
Electronics and Electricity School.

It would be interesting to continue this study using biographical
information, not ordinarily quantifiable, instead of extracts from
current examinations. Biographical questions carefully constructed
and easily verifiable could be used in predicting behavior, and pattern
scoring is a method that can be used to quantify responses to these
questions. Responses quantified by pattern scoring will, in fact,
show the highest possible correlation with the criterion in the
sample from which the pattern scores are derived.
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APPENDIX  A 


The Seven Best GCT Items Selected by SEQUIN

   (1)         (2)       (3)         (4)            (5)
  Form 7       Item     Recruit     Median         Median
   Item        Type     p Value     School         School
  Number                            p Value        Validity

    13          A         .77         .88            .22
    19          SC        .60         .78            .20
    31          A         .75         .85            .20
    55          A         .41         .49            .24
    62          SC        .60         .69            .26
    67          SC        .80         .87            .26
    94          SC        .55         .78            .30

NOTE: 1. Values in Columns (4) and (5) are based on item data
only for schools in which that item was selected in Program SEQUIN.
2. A = analogies; SC = sentence completion item.
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APPENDIX  B 


The Seven Best ETST Items Selected by SEQUIN

  Question     Item      Recruit     Median School     Median
   Number      Type*     P-Value        P-Value        School Validity

      3          M         .57            .71              .34
     11          M         .38            .58              .44
     13          M         .58            .69              .32
     22          S         .57            .77              .34
     40          S         .21            .37              .40
     41          E         .31            .39              .26
     50          E         .25            .31              .28

*  M = Math; S = Science; E = Electricity or Radio
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<■  APPENDIX  "G 
FIRST DATA PREPARATION PROGRAM

[Flowchart. Recoverable boxes, in order: READ CARDS 1, 2, 5, 6;
decision: IW = STAR, BLANK, DASH; YES branch: IW = 0; NO branch.]
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APPENDIX  D 

Listing  of  First  Data  Preparation  Program 
C
C
C   THIS PROGRAM EDITS RAW DATA FOR USE IN ETST STUDY
C
      INTEGER*2 DASH,ZERO,BLANK,IW,IC3,IC4,IC5
      DIMENSION IW(80)
      DATA DASH/'- '/,ZERO/'0 '/,BLANK/'  '/,IC3/'3 '/,IC4/'4 '/,
     1IC5/'5 '/,IC6/'6 '/,STAR/'* '/,IC1/'1 '/,IC2/'2 '/
      K=1
C
C
C   CHECK CARD NUMBER
   10 READ(4,400,END=500) IW
      IF(IW(8).EQ.IC1) GO TO 12
      IF(IW(8).EQ.IC2) GO TO 12
      IF(IW(8).EQ.IC3) GO TO 10
      IF(IW(8).EQ.IC4) GO TO 10
      IF(IW(8).EQ.IC5) GO TO 12
      IF(IW(8).EQ.IC6) GO TO 12
  400 FORMAT(80A1)
C
C
C       ZEROIZE STARS, BLANKS, DASHES
   12 DO 20 I=1,80
      IF(IW(I).EQ.STAR) IW(I)=ZERO
      IF(IW(I).EQ.BLANK) IW(I)=ZERO
      IF(IW(I).EQ.DASH) IW(I)=ZERO
   20 CONTINUE
      WRITE(8,300) IW
      WRITE(6,401) IW
  300 FORMAT(80A1)
  401 FORMAT(1X,80A1)
      GO TO 10
  500 IF(K.GE.2) GO TO 99
      K=K+1
      GO TO 10
   99 STOP
      END
//GO.FT06F001 DD SPACE=(CYL,(5,5),RLSE)
//GO.FT04F001 DD UNIT=2400,VOL=SER=NPS416,DISP=(OLD,PASS),
//   DCB=(RECFM=FB,LRECL=80,BLKSIZE=4800),LABEL=(1,NL,,IN),
//    DSN=&F1
//GO.FT04F002 DD UNIT=2400,VOL=SER=NPS416,DISP=(OLD,PASS),
//   DCB=(RECFM=FB,LRECL=80,BLKSIZE=4800),LABEL=(2,NL,,IN),
//    DSN=&F2
//GO.FT08F001 DD DISP=(NEW,KEEP),UNIT=2321,VOL=SER=CEL001,
//    LABEL=EXPDT=73180,SPACE=(TRK,(571)),DSNAME=S0575.KPW2,
//    DCB=(RECFM=FB,BLKSIZE=2000,LRECL=80)
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APPENDIX  E 
SECOND DATA PREPARATION PROGRAM

[Flowchart. Recoverable boxes, in order: READ CARDS 5, 6; COMPUTE
RANGE OF CRITERION SCORES; CORRECT ETST QUESTIONS (ASSIGN 1 TO
CORRECT RESPONSE, 0 TO INCORRECT RESPONSE); output of BINARY PATTERN,
PREDICTOR SCORE, CRITERION SCORE, IN-HOUSE I.D., NAVY SERVICE NUMBER;
output of RANGE OF CRITERION SCORES.]
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c 
APPENDIX F

Listing of Second Data Preparation Program

C
C      THIS PROGRAM EDITS DATA FROM THE ETST TEST
C
C      A(J)=PREDICTOR,C(J)=CRITERION,D(J)=SERV.NO.,
C
C
      IMPLICIT INTEGER*4(A-Z)
      DIMENSION A(2500),C(2500),D(2500),W(7)
      DATA ISAMP,NREAD,NWRITE,NPUNCH,IHI,ILO/2400,8,6,7,
     156,56/
      DO 100 J=1,ISAMP
      IF(J.EQ.2398) GO TO 50
   10 READ(NREAD,1,END=50) B,D(J),KDNUM,(W(I),I=1,7)
    1 FORMAT(A1,I6,I1,T11,I1,T19,I1,T21,I1,T30,I1,T48,I1,
     1T49,I1,T58,I1)
C
C      THIS PHASE CHECKS FOR THE PRESENCE OF CARD NUMBERS
C      FIVE AND SIX
C
      IF(KDNUM.EQ.5) GO TO 3
      GO TO 10
    3 K=1
      READ(NREAD,2) KDNUM,A(J),C(J)
    2 FORMAT(T8,I1,T52,I2,T64,I2)
      IF(KDNUM.EQ.6) GO TO 7
      GO TO 10
    7 K=K+1
      IF(K.NE.2) GO TO 10
C
C      THIS PHASE DETERMINES RANGE OF CRITERION SCORES
C
      IF(C(J).EQ.0) GO TO 12
      IF(C(J).LT.ILO) ILO=C(J)
      IF(C(J).GT.IHI) IHI=C(J)
C
C      THIS PHASE DETERMINES CORRECT/INCORRECT RESPONSES
C      OF SELECTED ETST ITEMS NOS. 3,11,13,22,40,41, AND 50
C
   12 IF(W(1).NE.5) W(1)=0
      IF(W(2).NE.5) W(2)=0
      IF(W(3).NE.5) W(3)=0
      IF(W(4).NE.2) W(4)=0
      IF(W(5).NE.3) W(5)=0
      IF(W(6).NE.2) W(6)=0
      IF(W(7).NE.3) W(7)=0
      DO 20 I=1,7
      IF(W(I).EQ.0) GO TO 20
      W(I)=1
   20 CONTINUE
C
C      OUTPUT INFO ONTO CARDS, DATA CELL, PAPER
C      J IS USED AS INHOUSE ID
C
      WRITE(NWRITE,30) (W(I),I=1,7),C(J),A(J),J,B,D(J)
   30 FORMAT(4(1X,7I1,I2,I2,I4,T20,A1,I6,5X))
      WRITE(NPUNCH,40) (W(I),I=1,7),C(J),A(J),J,B,D(J)
      WRITE(4,40) (W(I),I=1,7),C(J),A(J),J,B,D(J)
   40 FORMAT(7I1,I2,I2,I4,T20,A1,I6)
  100 CONTINUE
   50 WRITE(6,25) IHI,ILO
   25 FORMAT(' IHI EQUALS',I2,/' ILO EQUALS',I2,/)
      STOP
      END
//GO.FT06F001 DD SPACE=(CYL,(5,5),RLSE)
//GO.FT07F001 DD SYSOUT=B
//GO.FT08F001 DD DSN=S0575.KPW2,UNIT=2321,VOL=SER=CEL001,
//    DCB=(RECFM=FB,BLKSIZE=2000,LRECL=80),DISP=(OLD,KEEP)
//GO.FT04F001 DD DISP=(NEW,KEEP),UNIT=2321,VOLUME=SER=CEL001,
//    LABEL=EXPDT=73180,SPACE=(TRK,(571)),DSNAME=S0575.KPW3,
//    DCB=(RECFM=FB,BLKSIZE=2000,LRECL=80)
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C  APPENDIX    G 

Listing of Validation Program

C
C      THIS PROGRAM WORKS ON ODD NUMBERED QUESTIONS FROM THE
C      ETST EXAM. THE MULTIPLE CORRELATION COEFFICIENT IS
C      CALCULATED AS WELL AS THE CORRELATION COEFFICIENT
C      BETWEEN PATTERN AND TOTAL CORRECT.
C
C
      INTEGER*4 A,C,D,E,G,H,P
      REAL*8 C1,C2,A1,A2,V,W,X,R1,R2,R3,R4,R5,Q,Z1,Z2,Z3,Z,
     1R1STAR,AA,BB
      DIMENSION A(1200),B(1200),C(1200),D(1200),E(1200),
     1F(128,47),G(128),H(128),P(7,1200),S(128),X(1200)
      DATA N1,N2,N3,N4/1182,7,47,128/
      L=0
C
C
C      READ IN DATA
C
      DO 13 J=1,N1
      READ(9,9)(P(I,J),I=1,N2),C(J),A(J),D(J)
    9 FORMAT(7I1,I2,I2,I4)
C
C
C      IDUM IS A DUMMY VARIABLE CONTROLLING THE READING OF
C      EITHER EVEN OR ODD DATA.
C
      READ(9,2) IDUM
    2 FORMAT(I1)
   13 CONTINUE
C
C
C      COUNT TOTAL ONES FOR EACH SUBJECT
C
    5 DO 15 J=1,N1
      E(J)=0
      DO 12 I=1,N2
      E(J)=P(I,J)+E(J)
   12 CONTINUE
   15 CONTINUE
      IA1=0
      IA2=0
      IV=0
      DO 8 J=1,N1
C
C
C      IA1, IA2, IV ARE VARIABLES TO BE USED IN THE CALCU-
C      LATION OF R(TOTAL ONES) LATER IN THE PROGRAM.
C
      IA1=E(J)+IA1
      IA2=E(J)*E(J)+IA2
      IV=C(J)*E(J)+IV
    8 CONTINUE
C
C
C      OUTPUT VALUES FOR IA1, IA2 TO BE USED IN THE
C      COMPUTATION OF TEST-RETEST CORRELATION COEFFICIENT.
C
      WRITE(6,999) IA1,IA2
  999 FORMAT(T20,'IA1=',I8,//,T20,'IA2=',I8)
C
C
C      DETERMINE THE JOINT FREQUENCY DISTRIBUTION OF PATTERN
C      AND CRITERION SCORES (THE SECOND I LOOP CONVERTS
C      BINARY NUMBER PATTERNS TO DECIMAL EQUIVALENTS
C      TO SERVE AS ROW ADDRESSES).
C
      DO 14 I=1,N4
      DO 17 J=1,N3
      F(I,J)=0
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Listing  of  Validation  Program 
(Continued) 

   17 CONTINUE
   14 CONTINUE
      DO 18 J=1,N1
      M=1
      K=N4
      DO 19 I=1,N2
      K=K/2
      M=K*P(I,J)+M
   19 CONTINUE
      N=C(J)-29
      F(M,N)=F(M,N)+1
      B(J)=M
   18 CONTINUE
C
C
C      COMPUTATION OF PATTERN SCORES
C
      DO 20 I=1,N4
      S1=0
      S2=0
      DO 21 J=1,N3
      S2=F(I,J)+S2
      S1=(J+29)*F(I,J)+S1
   21 CONTINUE
      IF(S2.EQ.0) GO TO 10
      S(I)=S1/S2
      GO TO 20
   10 S(I)=-1
   20 CONTINUE
      WRITE(7,25)(S(I),I=1,N4)
   25 FORMAT(10F7.4)
C
C
C      ASSIGNMENT OF PATTERN SCORES TO SUBJECTS
C
      DO 31 J=1,N1
      K=B(J)
      X(J)=S(K)
   31 CONTINUE
C
C
C
C      COMPUTATION OF CORRELATIONS
C
      A1=0.D0
      A2=0.D0
      C1=0.D0
      C2=0.D0
      X2=0.D0
      X1=0.D0
      V=0.D0
      W=0.D0
      UU=0.
      DO 41 J=1,N1
      C1=C(J)+C1
      C2=C(J)*C(J)+C2
      A1=A(J)+A1
      A2=A(J)*A(J)+A2
      X1=X(J)+X1
      X2=X(J)*X(J)+X2
      V=C(J)*A(J)+V
      W=C(J)*X(J)+W
      UU=E(J)*X(J)+UU
   41 CONTINUE
      R3=(N1*C2)-(C1*C1)
      R2=(N1*X2)-(X1*X1)
      R5=(N1*W)-(C1*X1)
      Q=(R2*R3)**0.5
      R2=R5/Q
C
C
C      FIRST TIME THROUGH PROGRAM R1 IS THE CORRELATION
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Listing  of  Validation  Program 
(Continued) 

C      COEFFICIENT FOR PATTERN SCORE VS. CRITERION. SECOND
C      TIME THROUGH R1 IS CORRELATION COEFFICIENT FOR TOTAL
C      ONES VS. CRITERION.
C
   30 R1=(N1*A2)-(A1*A1)
      R4=(N1*V)-(C1*A1)
      Q=(R1*R3)**0.5
      R1=R4/Q
      IF(L.EQ.1) R1STAR=R1
C
C
C      COMPUTATION OF TEST STATISTIC FOR R DIFFERENCE
C
      Z1=(1+R1)/(1-R1)
      Z1=DLOG(Z1)/2
      IF(L.EQ.1) GO TO 40
      Z2=(1+R2)/(1-R2)
      Z2=DLOG(Z2)/2
      Z3=2./(N1-3.)
      Z3=(Z3)**0.5
   40 Z=(Z2-Z1)/Z3
      IF (L.EQ.1) GO TO 90
C
C
C      CONSTRUCTION OF RESPONSE PATTERNS
C
      G(1)=2
      H(1)=0
      G(2)=1
      H(2)=1
      K=1
      N=1
   50 N=2*N
      IF(N.GE.N4) GO TO 60
      K=10*K
      DO 51 I=1,N
      G(N+I)=G(I)+K
      H(N+I)=H(I)+1
      G(I)=G(I)+2*K
   51 CONTINUE
      GO TO 50
C
C
C      ORDERING OF RESPONSE PATTERNS
C
   60 N=N4-1
   70 K=0
      DO 80 I=1,N
      V=G(I)
      W=H(I)
      IF (S(I+1).GE.S(I)) GO TO 80
      U=S(I)
      V=G(I)
      W=H(I)
      S(I)=S(I+1)
      G(I)=G(I+1)
      H(I)=H(I+1)
      S(I+1)=U
      G(I+1)=V
      H(I+1)=W
      K=1
   80 CONTINUE
      IF (K.EQ.1) GO TO 70
C
C
C      PRINT OUT
C
      NUM=1
      NUM1=32
   99 WRITE(6,100)(G(I),H(I),S(I),I=NUM,NUM1)
  100 FORMAT('1',6(/),T60,'ETST EXAM',2(/),T45,'+',
     838('-'),'+',/,T45,'|',1X,'PATTERN',
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Listing  of  Validation  Program 
(Continued) 


C
C
C      THE VARIABLE R7 IS THE CORRELATION COEFFICIENT BETWEEN
C      PATTERN SCORES AND TOTAL CORRECT (ONES).
C
      R7=N1*UU-IA1*X1
      WRITE(6,33) R7,UU
   33 FORMAT(' R7=',F18.4,'UU=',F18.4)
   37 FORMAT(' R7=',F18.4,'UU=',F18.4)
      R1=(N1*A2)-(A1*A1)
      R8=(N1*X2)-(X1*X1)
      Q=(R1*R8)**.5
      R7=R7/Q
      WRITE(6,213) R7
  213 FORMAT('0','R(PATTERN SCORE/TOTAL ONES) EQUALS',F6.3)
C
C
C      RMUL IS THE MULTIPLE CORRELATION COEFFICIENT TO BE
C      USED IN THE DETERMINATION OF 'F'.
C
      RMUL=((R2**2)+(R1STAR**2)-2*(R2*R1STAR*R7))/(1-R7**2)
      RMUL=RMUL**.5
      FF=((RMUL**2)-(R1STAR**2))*(N1-3)/(1-(RMUL**2))
      WRITE(6,417) RMUL,FF
  417 FORMAT('0','R(MULT. CORREL. COEF.) EQUALS',4X,F6.4,//,
     1' FF EQUALS',F8.4)
C
C
C      THIS PORTION OF THE PROGRAM IS USED IN THE DETERMINA-
C      TION OF A FREQUENCY DISTRIBUTION, I.E. PATTERN
C      VS. NUMBER OF PEOPLE WITH THAT PATTERN
C
C
      DO 350 I=1,128
      WRITE(6,357) G(I),S(I)
  357 FORMAT(T20,'A PATTERN OF:',2X,I7,2X,'AND PATTERN SCORE
     13X,F8.4,//)
      KOUNT=0
      DO 355 J=1,N1
      IF(S(I).EQ.X(J)) GO TO 359
      GO TO 355
  359 KOUNT=KOUNT+1
      WRITE(6,352) D(J)
  352 FORMAT(T15,I4)
  355 CONTINUE
      WRITE(6,351) KOUNT
  351 FORMAT(' TOTAL PEOPLE HAVING THIS SCORE:',I3,//)
  350 CONTINUE
      STOP
      END
//GO.FT06F001 DD SPACE=(CYL,(5,1)),SYSOUT=D
//GO.FT07F001 DD SYSOUT=B
//GO.FT09F001 DD DSN=S0575.KPW3,UNIT=2321,VOL=SER=CEL001,
//    DCB=(RECFM=FB,BLKSIZE=2000,LRECL=80),DISP=(OLD,KEEP)
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APPENDIX  H 
Output  I  -  Pattern  Information 


ETST EXAM

PATTERN  | TOTAL ONES | PATTERN SCORE

2222112  |     2      |    -1.0000
2222111  |     3      |    -1.0000
2212121  |     3      |    -1.0000
2212111  |     4      |    -1.0000

2122121  |     3      |    -1.0000
2122112  |     3      |    -1.0000
2122111  |     4      |    -1.0000
2121122  |     3      |    -1.0000

2112211  |     4      |    -1.0000
2112112  |     4      |    -1.0000
1212111  |     5      |    -1.0000
1122121  |     4      |    -1.0000

1122112  |     4      |    -1.0000
1122111  |     5      |    -1.0000
2212211  |     3      |    48.5000
2222211  |     2      |    49.0000

2222222  |     0      |    49.6071
1222222  |     1      |    51.3182
2222121  |     2      |    52.0000
2122211  |     3      |    52.0000

2221222  |     1      |    52.5238
2222212  |     1      |    52.5833
1222121  |     3      |    53.5000
2222221  |     1      |    53.6667

2122222  |     1      |    53.8000
2121211  |     4      |    54.0000
2222122  |     1      |    54.1667
2221212  |     2      |    54.3000

2212122  |     2      |    54.5000
2212222  |     1      |    54.5500
1212222  |     2      |    54.7500
1222212  |     2      |    54.8889

NOTE 1: IN PATTERNS, 2'S REPRESENT 0'S
NOTE 2: A PATTERN SCORE OF -1 INDICATES A PATTERN NO ONE HAS
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APPENDIX  H 
Output I - Pattern Information (Continued)

ETST EXAM

PATTERN  | TOTAL ONES | PATTERN SCORE

2221221  |     2      |    56.0000
2112111  |     5      |    56.0000
1222122  |     2      |    56.3333
2122221  |     2      |    56.5000

2122212  |     2      |    56.5000
1222112  |     3      |    56.5000
1212122  |     3      |    56.8000
2121212  |     3      |    56.8750

2212212  |     2      |    57.2500
1122222  |     2      |    57.3333
1221222  |     2      |    57.3913
1122221  |     3      |    57.6000

2211222  |     2      |    57.7308
2221121  |     3      |    58.0000
2121222  |     2      |    58.0000
2112212  |     3      |    58.0000

2211212  |     3      |    58.2941
1212212  |     3      |    58.3077
1221221  |     3      |    58.3333
2221211  |     3      |    58.5000

2211122  |     3      |    58.7143
1212221  |     3      |    58.8750
2211221  |     3      |    58.9091
2221122  |     2      |    59.0000

2211211  |     4      |    59.0000
1222111  |     4      |    59.0000
1221212  |     3      |    59.2222
1221211  |     4      |    59.2500

1222221  |     2      |    59.5000
2121112  |     4      |    59.6667
2112222  |     2      |    59.7143
1211222  |     3      |    60.0682
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APPENDIX  H 


Output I - Pattern Information (Continued)

ETST EXAM

PATTERN  | TOTAL ONES | PATTERN SCORE

2111222  |     3      |    60.7333
1221122  |     3      |    60.7500
2121221  |     3      |    61.0000
1121222  |     3      |    61.0000

1112122  |     4      |    61.2500
2212221  |     2      |    61.6667
1212211  |     4      |    61.6667
2221112  |     3      |    62.0000

2212112  |     3      |    62.0000
2122122  |     2      |    62.0000
2111212  |     4      |    62.2222
1211221  |     4      |    62.2308

2112121  |     4      |    62.5000
1212112  |     4      |    62.5000
1122212  |     3      |    62.5714
1112222  |     3      |    62.8621

2112221  |     3      |    63.0000
1112212  |     4      |    63.2609
1211212  |     4      |    63.2857
1122122  |     3      |    63.5000

1121212  |     4      |    63.5625
1211122  |     4      |    63.6000
1112121  |     5      |    63.7500
2121121  |     4      |    64.0000

2112122  |     3      |    64.0000
1211211  |     5      |    64.0000
1122211  |     4      |    64.2500
1111222  |     4      |    64.4000

2111122  |     4      |    64.6000
1121122  |     4      |    64.8333
2211121  |     4      |    65.0000
2111221  |     4      |    65.0000
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APPENDIX  H 
Output I - Pattern Information (Continued)

ETST EXAM

PATTERN  | TOTAL ONES | PATTERN SCORE

2211112  |     4      |    65.2000
1221112  |     4      |    65.2000
1211112  |     5      |    65.3000
1121221  |     4      |    65.3000

1112211  |     5      |    65.5000
1111212  |     5      |    65.6097
1112112  |     5      |    65.7500
2111121  |     5      |    65.7778

1112221  |     4      |    65.8571
2111112  |     5      |    66.0000
1222211  |     3      |    66.0000
1121112  |     5      |    66.3333

2121111  |     5      |    66.5000
2111211  |     5      |    66.7500
1111221  |     5      |    66.7667
1111122  |     5      |    67.7273

2221111  |     4      |    68.0000
1221111  |     5      |    68.3333
1211121  |     5      |    68.5000
1121211  |     5      |    68.5000

1111211  |     6      |    68.5000
1111112  |     6      |    68.7500
1212121  |     4      |    69.5000
1121121  |     5      |    69.6667

2211111  |     5      |    69.8333
2111111  |     6      |    70.0000
1112111  |     6      |    70.5000
1211111  |     6      |    70.8000

1221121  |     4      |    71.0000
1111121  |     6      |    71.2954
1111111  |     7      |    73.2857
1121111  |     6      |    73.6667
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APPENDIX  I 


Output II - Subject Information

ETST EXAM

IDENT | PREDICTOR | CRITERION | PATTERN SCORE | TOTAL ONES

    1 |    66     |    76     |    68.7500    |     6
    3 |    59     |    53     |    54.7500    |     2
    5 |    57     |    54     |    54.7500    |     2
    7 |    73     |    73     |    71.2954    |     6
    9 |    56     |    60     |    58.9091    |     3
   11 |    71     |    70     |    66.0000    |     5
   13 |    63     |    64     |    64.4000    |     4
   15 |    59     |    57     |    54.5500    |     1
   17 |    64     |    56     |    57.3913    |     2
   19 |    64     |    54     |    49.6071    |     0
   21 |    46     |    58     |    54.7500    |     2
   23 |    56     |    56     |    58.8750    |     3
   25 |    56     |    60     |    65.6097    |     5
   27 |    55     |    54     |    58.2941    |     3
   29 |    65     |    68     |    68.7500    |     6
   31 |    49     |    57     |    54.7500    |     2
   33 |    55     |    62     |    60.0682    |     3
   35 |    72     |    76     |    73.2857    |     7
   37 |    68     |    71     |    68.7500    |     6
   39 |    66     |    68     |    65.8571    |     4
   41 |    73     |    71     |    70.0000    |     6
   43 |    65     |    60     |    64.4000    |     4
   45 |    59     |    59     |    61.0000    |     3
   47 |    63     |    76     |    73.6667    |     6
   49 |    65     |    64     |    62.5000    |     4
   51 |    54     |    38     |    51.3182    |     1
   53 |    66     |    64     |    62.8621    |     3
   55 |    58     |    62     |    57.2500    |     2
   57 |    51     |    59     |    57.2500    |     2
   59 |    61     |    64     |    65.2000    |     4
   61 |    62     |    68     |    65.3000    |     4
   63 |    63     |    67     |    70.0000    |     6
   65 |    60     |    62     |    56.0000    |     2
   67 |    66     |    70     |    68.5000    |     5
   69 |    68     |    69     |    63.6000    |     4
   71 |    60     |    68     |    68.5000    |     6
   73 |    62     |    60     |    54.5500    |     1
   75 |    58     |    60     |    56.8750    |     3
   77 |    65     |    59     |    54.7500    |     2
   79 |    58     |    58     |    61.0000    |     3
   81 |    62     |    57     |    56.0000    |     2
   83 |    63     |    71     |    69.5000    |     4
   85 |    59     |    58     |    57.7308    |     2
   87 |    70     |    76     |    73.2857    |     7
   89 |    59     |    70     |    65.7778    |     5
   91 |    51     |    61     |    60.0682    |     3
   93 |    51     |    49     |    52.5238    |     1
   95 |    45     |    60     |    57.6000    |     3
   97 |    66     |    68     |    68.5000    |     5
   99 |    71     |    65     |    68.7500    |     6
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C  APPENDIX    J 

Cross-Validation Listing

C
C      CROSS VALIDATION
C
      INTEGER*4 A,C,D,E,G,H,P
      REAL*8 C1,C2,A1,A2,V,W,X,R1,R2,R3,R4,R5,Q,Z1,Z2,Z3,Z,
     1AA,BB
      DIMENSION A(1200),B(1200),C(1200),D(1200),E(1200),
     1G(128),H(128),P(7,1200),S(128),X(1200)
C
C      NOTE: PARAMETERS OF DATA CARD
C
      DATA N1,N2,N3,N4/1182,7,47,128/
      L=0
C
C
C      READ IN DATA
C
      DO 13 J=1,N1
C
C
C      THIS CROSS VALIDATION PROGRAM READS DATA FROM THE
C      ETST EXAM
C
C
C
C
C      IDUM IS A DUMMY VARIABLE CONTROLLING THE READING OF
C      EITHER EVEN OR ODD DATA.
C
      READ(4,2) IDUM
    2 FORMAT(I1)
      READ(4,9) (P(I,J),I=1,N2),C(J),A(J),D(J)
    9 FORMAT(7I1,I2,I2,I4)
   13 CONTINUE
C
C
C      READ REGRESSED MEAN AND WEIGHT COMPUTED FROM THE
C      MAIN PROGRAM.
C
      READ(5,3) AA,BB
    3 FORMAT(F6.3,F6.3)
C
C
C      READ PATTERN SCORES CALCULATED FROM MAIN PROGRAM
C
      READ(5,10)(S(I),I=1,N4)
   10 FORMAT(10F7.4)
C
C
C      COUNT TOTAL CORRECT FOR EACH SUBJECT
C
      DO 15 J=1,N1
      E(J)=0
      DO 12 I=1,N2
      E(J)=P(I,J)+E(J)
   12 CONTINUE
   15 CONTINUE
      IA1=0
      IA2=0
      IV=0
C
C
C      IA1, IA2, IV ARE VARIABLES TO BE USED IN THE CALCU-
C      LATION OF R(TOTAL ONES) LATER IN THE PROGRAM.
C
      DO 8 J=1,N1
      IA1=E(J)+IA1
      IA2=E(J)*E(J)+IA2
      IV=C(J)*E(J)+IV
    8 CONTINUE
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Cross-Validation  Listing 
(Continued) 

i 

c 
c 

C  OUTPUT    VALUES    FOR    I Al ,     AND    IA2    TO    BE    USED     IN    THE 

C  COMPUTATION    OF    TEST-RETEST    CORRELATION    COEFFICIENT. 

C 

WRITE(6,999)     IA1,IA2 
999       FORMAT  IT 20, •IA1=«,I8,//,T2  0,'I£2  =  ,,18) 
C 
C 

C  CALCULATE    DECIMAL    ECUIVALENT    OF    BINARY    PATTERNS 

C  THE    B(J)    ARRAY    HOLDS    DECIMAL    EQUIVALENT    OF    EACH 

C  SUBJECT'S    BINARY    PATTERN. 

C 

CO    18    J  =  1,N1 

fj  =  l 

K  =  N4 

DO    19    I     =1,N2 

K  =  K/2 

M  =  K*P(  I  .J)+M 
19  CONTINUE 

B( J)=M 
18         CONTINUE 
C 
C  ASSIGNMENT    OF    PATTERN    SCORES    TO    SUBJECTS 

DO    3  1    J=1,N1 

K=B(J) 

XC J)=S(K) 

IF(X(  JJ.LT.O)     X(  J)=E(J)*BB+AA 
31  CONTINUE 

C
C      COMPUTATION OF CORRELATIONS
C
      A1=0.D0
      A2=0.D0
      C1=0.D0
      X1=0.D0
      X2=0.D0
      W=0.D0
      V=0.D0
      UU=0.
      C2=0.D0
      DO 41 J=1,N1
      C1=C(J)+C1
      C2=C(J)*C(J)+C2
      A1=A(J)+A1
      A2=A(J)*A(J)+A2
      X1=X(J)+X1
      X2=X(J)*X(J)+X2
      V=C(J)*A(J)+V
      W=C(J)*X(J)+W
      UU=E(J)*X(J)+UU
41    CONTINUE
      R3=(N1*C2)-(C1*C1)
      R2=(N1*X2)-(X1*X1)
      R5=(N1*W)-(C1*X1)
      Q=(R2*R3)**0.5
      R2=R5/Q
C
C
C      FIRST TIME THROUGH PROGRAM R1 IS THE CORRELATION
C      COEFFICIENT FOR PREDICTOR VS. CRITERION. SECOND
C      TIME THROUGH R1 IS CORRELATION COEFFICIENT FOR TOTAL
C      CORRECT VS. CRITERION.
C
30    R1=(N1*A2)-(A1*A1)
      R4=(N1*V)-(C1*A1)
      Q=(R1*R3)**0.5
      R1=R4/Q
      IF(L.EQ.1) R1STAR=R1
C
C
C      COMPUTATION OF TEST STATISTIC FOR R DIFFERENCE
C
      Z1=(1+R1)/(1-R1)
      Z1=DLOG(Z1)/2
      IF(L.EQ.1) GO TO 40
      Z2=(1+R2)/(1-R2)
      Z2=DLOG(Z2)/2
      Z3=2./(N1-3.)
      Z3=(Z3)**0.5
40    Z=(Z2-Z1)/Z3
      IF (L.EQ.1) GO TO 90
C
C
C      PRINT OUT
C
      NUM2=1
      NUM3=50
199   WRITE(6,200)(D(J),A(J),C(J),X(J),E(J),J=NUM2,NUM3)
200   FORMAT('1',4(/),T60,'ETST',T67,'EXAM',2(/),T34,'+',
     160('-'),'+',/,T34,'|',1X,'IDENT',1X,'|',1X,
     2' PREDICTOR',1X,'|',1X,'CRITERION',1X,'|',
     3' PATTERN SCORE',1X,'|',1X,'TOTAL ONES','|',/,T33,
     4' |',T95,'|',/'+',T35,60('-'),/,10(5(T34,'|',I5,
     52X,'|',T49,I2,T55,'|',T61,I2,T66,'|',T71,F8.4,
     6T82,'|',T89,I1,T95,'|',/),T34,'|',T95,'|',/'+',
     7T35,60('-'),/))
C
C
C      THE NEXT TWO IF STATEMENTS CONTROL THE NUMBER
C      AND LENGTH OF THE LAST TABLE
C
C
C
C      THE FIRST 'IF' STATEMENT: NUMBER INSIDE PAREN
C      SHOULD BE ONE MULTIPLE OF '5' HIGHER THAN N1; E.G. IF
C      N1=627, NUMBER INSIDE PAREN SHOULD BE 630; IF N1=986,
C      NUMBER INSIDE PAREN SHOULD BE 990.
C
      IF(NUM3.EQ.1190) GO TO 221
      NUM2=NUM2+50
      NUM3=NUM3+50
C
C
C      SECOND 'IF' STATEMENT: NUMBER INSIDE PAREN MUST BE
C      ONE MULTIPLE OF 50 +1 LOWER THAN N1; E.G., IF N1=1286,
C      NUMBER INSIDE PAREN SHOULD BE 1251; IF N1=126,
C      NUMBER INSIDE PAREN SHOULD BE 101.
C
      IF(NUM3.LT.1151) GO TO 199
      NUM2=1151
      NUM3=1190
      GO TO 199
221   A1=IA1
      A2=IA2
      V=IV
209   WRITE(6,210) R2,R1,Z
210   FORMAT('1','CORRELATION AND TESTS'//' ',
     1' R(PATTERN) EQUALS',F15.7,/,
     2'R(PREDICTOR) EQUALS',F15.7,/,
     3' Z EQUALS',F15.7/)
      L=1
      GO TO 30
90    WRITE(6,212) R1,Z
212   FORMAT('0','R(TOTAL CORRECT) EQUALS',F15.7/
     1' Z EQUALS',F15.7/)
C
C
C      THE VARIABLE R7 IS THE CORRELATION COEFFICIENT
C      BETWEEN PATTERN SCORES AND TOTAL CORRECT (ONES)
C
      R7=N1*UU-IA1*X1
      WRITE(6,37) R7,UU
37    FORMAT(' R7=',F18.4,'UU=',F18.4)
      R1=(N1*A2)-(A1*A1)
      R8=(N1*X2)-(X1*X1)
      Q=(R1*R8)**.5
      R7=R7/Q
      WRITE(6,213) R7
213   FORMAT('0','R(PATTERN SCORE/TOTAL ONES) EQUALS',F6.3)
C
C
C      RMUL IS THE MULTIPLE CORRELATION COEFFICIENT
C      TO BE USED IN THE DETERMINATION OF 'F'.
C
C
      RMUL=((R2**2)+(R1STAR**2)-2*(R2*R1STAR*R7))/(1-R7**2)
      RMUL=RMUL**.5
      FF=((RMUL**2)-(R1STAR**2))*(N1-3)/(1-(RMUL**2))
      WRITE(6,417) RMUL,FF
417   FORMAT('0',' R(MULT. CORREL. COEF.) EQUALS',4X,F6.4,//,
     1' FF EQUALS',F8.4)
      STOP
      END
//GO.FT06F001 DD SPACE=(CYL,(5,1))
//GO.FT07F001 DD SYSOUT=B
//GO.FT04F001 DD DSN=S0575.KPW3,UNIT=2321,VOL=SER=CEL001,
//             DCB=(RECFM=FB,BLKSIZE=2000,LRECL=80),DISP=(OLD,KEEP)
//GO.SYSIN DD *
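
The pattern-scoring steps of the listing (the DO 18 and DO 31 loops) can be restated compactly. The following Python sketch is illustrative only and is not part of the original programs. The function names `pattern_index` and `assign_score` are hypothetical, and the sketch assumes, as the listing suggests but does not state, that a negative entry in the score table marks a pattern with no usable score, in which case the regression estimate AA + BB * (total correct) is substituted.

```python
def pattern_index(pattern):
    """Return the 1-based table index of a 0/1 response pattern.

    Mirrors the DO 19 loop: M starts at 1, K starts at 2**N2 and is
    halved before each item, so the first item is the most significant bit.
    """
    index = 1
    weight = 2 ** len(pattern)
    for bit in pattern:
        weight //= 2
        index += weight * bit
    return index


def assign_score(pattern, scores, aa, bb):
    """Look up a subject's pattern score, as in the DO 31 loop.

    scores is the table S(1..N4) stored as a 0-based list. A negative
    entry is assumed (not confirmed by the source) to flag a pattern with
    no usable score; the estimate aa + bb * total_correct is used instead.
    """
    total_correct = sum(pattern)
    score = scores[pattern_index(pattern) - 1]
    if score < 0:
        score = aa + bb * total_correct
    return score


if __name__ == "__main__":
    # Made-up three-item table (8 patterns); -1.0 marks a missing score.
    table = [50.0, 52.0, -1.0, 58.0, 55.0, 60.0, 62.0, 70.0]
    print(pattern_index([0, 0, 0]))                   # 1
    print(assign_score([0, 1, 0], table, 48.0, 3.0))  # 51.0
```

With seven items, as in this study, the all-zero pattern maps to index 1 and the all-one pattern to index 128, which is why the listing starts M at 1 rather than 0.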
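APPENDIX K

Output III - Cross-Validation
(First 50 Subjects)

ETST EXAM

+-------+-----------+-----------+---------------+------------+
| IDENT | PREDICTOR | CRITERION | PATTERN SCORE | TOTAL ONES |
+-------+-----------+-----------+---------------+------------+
|     2 |        58 |        62 |       51.3182 |          1 |
|     4 |        72 |        76 |       71.2954 |          6 |
|     6 |        65 |        67 |       58.5000 |          3 |
|     8 |        59 |        52 |       52.5238 |          1 |
|    10 |        62 |        61 |       62.2308 |          4 |
|    12 |        71 |        76 |       73.2857 |          7 |
|    14 |        66 |        67 |       64.4000 |          4 |
|    16 |        63 |        62 |       65.3000 |          5 |
|    18 |        48 |        36 |       64.2500 |          4 |
|    20 |        50 |        52 |       52.5833 |          1 |
|    22 |        63 |        66 |       66.3333 |          5 |
|    24 |        61 |        53 |       54.5533 |          1 |
|    26 |        50 |        56 |       57.7308 |          2 |
|    28 |        59 |        65 |       54.7500 |          2 |
|    30 |        62 |        64 |       64.4000 |          4 |
|    32 |        61 |        58 |       54.7500 |          2 |
|    34 |        71 |        68 |       65.6097 |          5 |
|    36 |        65 |        62 |       62.8621 |          3 |
|    38 |        57 |        58 |       59.7143 |          2 |
|    40 |        66 |        62 |       57.7308 |          2 |
|    42 |        65 |        58 |       65.8571 |          4 |
|    44 |        61 |        56 |       52.5833 |          1 |
|    46 |        68 |        66 |       69.6667 |          5 |
|    48 |        70 |        75 |       73.2857 |          7 |
|    50 |        67 |        59 |       52.5238 |          1 |
|    52 |        62 |        61 |       64.8000 |          4 |
|    54 |        68 |        68 |       65.6397 |          5 |
|    56 |        66 |        65 |       61.0000 |          3 |
|    58 |        61 |        66 |       71.2954 |          6 |
|    60 |        59 |        54 |       60.7333 |          3 |
|    62 |        54 |        55 |       62.8621 |          3 |
|    64 |        55 |        52 |       62.8621 |          3 |
|    66 |        51 |        58 |       60.7333 |          3 |
|    68 |        56 |        51 |       49.6071 |          0 |
|    70 |        62 |        57 |       54.3000 |          2 |
|    72 |        60 |        61 |       62.2308 |          4 |
|    74 |        73 |        67 |       68.7500 |          6 |
|    76 |        65 |  (illeg.) |       73.2857 |          7 |
|    78 |        52 |        51 |       49.0000 |          2 |
|    80 |        63 |        74 |       66.7667 |          5 |
|    82 |        66 |        63 |       54.0000 |          4 |
|    84 |        70 |        72 |       67.7273 |          5 |
|    86 |        63 |        64 |       64.4000 |          4 |
|    88 |        55 |        51 |       54.5500 |          1 |
|    90 |        67 |        69 |       67.7273 |          5 |
|    92 |        70 |        71 |       68.7500 |          6 |
|    94 |        67 |        76 |       67.7273 |          5 |
|    96 |        53 |        64 |       65.0303 |          4 |
|    98 |        60 |        62 |       57.3913 |          2 |
|   100 |        72 |        62 |       67.7273 |          5 |
+-------+-----------+-----------+---------------+------------+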
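
The table above is the first of the 50-row pages that the cross-validation listing prints. In the listing the page limits are hard-coded (1151 and 1190) and, according to its comments, must be edited by hand for each sample size. As a hedged illustration only, the Python sketch below derives the same page boundaries from the sample size. The function name `page_bounds` and its parameters are hypothetical and do not appear in the original programs.

```python
def page_bounds(n_subjects, page_size=50):
    """Return 1-based, inclusive (first, last) row ranges, one per page."""
    return [
        (first, min(first + page_size - 1, n_subjects))
        for first in range(1, n_subjects + 1, page_size)
    ]


if __name__ == "__main__":
    pages = page_bounds(1190)
    # 24 pages; the last one is short and starts at row 1151.
    print(len(pages), pages[0], pages[-1])  # 24 (1, 50) (1151, 1190)
```

For a sample of 126 subjects the last page starts at row 101, which agrees with the example given in the listing's comments.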
APPENDIX L

GLOSSARY OF COMPUTER VARIABLES USED IN THESE PROGRAMS

A(J)     - jth subject's GCT score used as a predictor
A1       - sum of predictors
A2       - sum of squares of predictors
B(J)     - jth subject's decimal value of his binary score
C(J)     - jth subject's final school grade used as the criterion
C1       - sum of the criterion scores
C2       - sum of squares of criterion scores
D(J)     - jth subject's identification number
E(J)     - jth subject's total correct
F(,)     - joint frequency distribution
G( )     - binary pattern (2 replaced 0 in output)
H( )     - total correct in a binary pattern
IA1      - sum of total correct (total ones)
IA2      - sum of squares of total correct (total ones)
IHI, ILO - used in calculating range of criterion scores
IV       - sum of C(J)*E(J)
IW( )    - a column on a subject's record card
KDNUM    - card number
M        - column in 'F matrix'
N        - row in 'F matrix'
N1       - sample size
N2       - number of elements in the binary pattern
N3       - range of criterion scores
N4       - 2**N2; number of combinations of patterns of 1/0 with N2
           questions
P(,)     - jth subject's pattern of ones/zeros
R1       - correlation coefficient between criterion/predictor;
           2nd time, correlation coefficient between criterion/total
           correct
R2       - correlation coefficient between criterion/pattern
R3-5     - correlation coefficients used in determining R1 and R2
R7       - correlation coefficient between pattern scores and total ones
RMUL     - multiple correlation coefficient
S(I)     - a pattern score associated with a particular pattern
S1       - weighted sum of people with that pattern (weights being the
           criterion scores)
S2       - number of people with that pattern
V        - sum of product of C(J)*A(J)
W        - sum of product of C(J)*X(J)
W(1-7)   - an array of questions being used in this study
X(J)     - jth subject's pattern score
X1       - sum of pattern scores
X2       - sum of squares of pattern scores
Z        - test statistic
FF       - test statistic, F distribution
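
The statistics that these variables feed can be summarized in a few formulas. The Python sketch below is a restatement for illustration, assuming the formulas as they appear in the cross-validation listing (correlations from running sums, the Z statistic on Fisher-transformed correlations, RMUL, and FF). The function names are hypothetical and are not taken from the original programs.

```python
import math


def pearson_from_sums(n, sx, sxx, sy, syy, sxy):
    """Correlation from running sums, e.g. X1, X2, C1, C2, W for R2."""
    numerator = n * sxy - sx * sy
    return numerator / math.sqrt((n * sxx - sx * sx) * (n * syy - sy * sy))


def fisher_z(r):
    """Fisher transformation, as in Z1 = DLOG((1+R1)/(1-R1))/2."""
    return 0.5 * math.log((1.0 + r) / (1.0 - r))


def z_difference(r_pattern, r_other, n):
    """Z = (Z2 - Z1) / Z3 with Z3 = sqrt(2 / (N1 - 3))."""
    return (fisher_z(r_pattern) - fisher_z(r_other)) / math.sqrt(2.0 / (n - 3))


def multiple_r(r_a, r_b, r_ab):
    """RMUL for two predictors with validities r_a, r_b, intercorrelation r_ab."""
    return math.sqrt((r_a**2 + r_b**2 - 2.0 * r_a * r_b * r_ab) / (1.0 - r_ab**2))


def f_increment(r_mult, r_base, n):
    """FF = (RMUL**2 - R1STAR**2) * (N1 - 3) / (1 - RMUL**2)."""
    return (r_mult**2 - r_base**2) * (n - 3) / (1.0 - r_mult**2)


if __name__ == "__main__":
    # Sums for x = 1, 2, 3 and y = 2, 4, 6 (perfectly correlated).
    print(pearson_from_sums(3, 6, 14, 12, 56, 28))  # 1.0
```

In the listing, `r_a` corresponds to R2 (pattern score vs. criterion), `r_b` to R1STAR (total correct vs. criterion), and `r_ab` to R7 (pattern score vs. total ones).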